-----------------Epoch-----------------
A 4am crack                  2016-10-10
-------------------. updated 2020-06-24
                   |___________________

Name: Epoch
Genre: arcade
Year: 1981
Authors: Larry Miller
Publisher: Sirius Software
Media: single-sided 5.25-inch floppy
OS: custom
Previous cracks: several uncredited
  file cracks
Similar cracks:
  #516 Outpost
  #315 Beer Run

                   ~

               Chapter 0
 In Which Various Automated Tools Fail
          In Interesting Ways


COPYA
  immediate disk read error

Locksmith Fast Disk Backup
  unable to read any track

EDD 4 bit copy (no sync, no count)
  hangs during boot

Copy ][+ nibble editor
  track 0 has some 4-4 encoded data
  other tracks are unreadable

Disk Fixer
  nope (can't read 4-4 encoded tracks)

Why didn't COPYA work?
  not a 16-sector disk

Why didn't Locksmith FDB work?
  ditto

Why didn't my EDD copy work?
  I don't know. Could be a nibble check
  during boot. Could be that the data
  is loaded from half tracks. Could be
  both, or neither.

Next steps:

  1. Trace the boot
  2. Capture the game in memory
  3. Write it out to a standard disk
     with some kind of fastloader

                   ~

               Chapter 1
  In Which We Find A Very Unfriendly
         "Do Not Disturb" Sign


[S6,D1=original disk]
[S5,D1=my work disk]

]PR#5
CAPTURING BOOT0
...reboots slot 6...
...reboots slot 5...
SAVING BOOT0

]BLOAD BOOT0,A$800
]CALL -151

*801L

; display hi-res graphics page
; (uninitialized)
0801-   8D 50 C0    STA   $C050
0804-   8D 52 C0    STA   $C052
0807-   8D 54 C0    STA   $C054
080A-   8D 57 C0    STA   $C057

; get slot (x16)
080D-   A6 2B       LDX   $2B

; a counter? or an address?
080F-   A9 04       LDA   #$04
0811-   85 11       STA   $11
0813-   A0 00       LDY   #$00
0815-   84 10       STY   $10

; look for custom prologue ("DD AD DA")
0817-   BD 8C C0    LDA   $C08C,X
081A-   10 FB       BPL   $0817
081C-   C9 DD       CMP   #$DD
081E-   D0 F7       BNE   $0817
0820-   BD 8C C0    LDA   $C08C,X
0823-   10 FB       BPL   $0820
0825-   C9 AD       CMP   #$AD
0827-   D0 F3       BNE   $081C
0829-   BD 8C C0    LDA   $C08C,X
082C-   10 FB       BPL   $0829
082E-   C9 DA       CMP   #$DA
0830-   D0 EA       BNE   $081C

; read 4-4 encoded data immediately
; (no address field, no sector numbers)
0832-   BD 8C C0    LDA   $C08C,X
0835-   10 FB       BPL   $0832
0837-   38          SEC
0838-   2A          ROL
0839-   85 0E       STA   $0E
083B-   BD 8C C0    LDA   $C08C,X
083E-   10 FB       BPL   $083B
0840-   25 0E       AND   $0E

; ($10) is an address, initialized at
; $080F as $0400 (yes, the text page)
0842-   91 10       STA   ($10),Y
0844-   C8          INY
0845-   D0 EB       BNE   $0832
0847-   E6 11       INC   $11
0849-   A5 11       LDA   $11

; loop until we hit page 8 (i.e. we're
; filling $0400..$07FF)
084B-   C9 08       CMP   #$08
084D-   D0 E3       BNE   $0832
084F-   BD 80 C0    LDA   $C080,X

; clear $0900..$BFFF in main memory
0852-   A9 09       LDA   #$09
0854-   85 01       STA   $01
0856-   A9 00       LDA   #$00
0858-   85 00       STA   $00
085A-   A8          TAY
085B-   A2 B7       LDX   #$B7
085D-   91 00       STA   ($00),Y
085F-   C8          INY
0860-   D0 FB       BNE   $085D
0862-   E6 01       INC   $01
0864-   CA          DEX
0865-   D0 F6       BNE   $085D

; calculate a checksum of page 8 (this
; code right here)
0867-   8A          TXA
0868-   E8          INX
0869-   F0 06       BEQ   $0871
086B-   5D 00 08    EOR   $0800,X
086E-   4C 68 08    JMP   $0868

; use the stack pointer (!) to keep a
; copy of that checksum
0871-   AA          TAX
0872-   9A          TXS

; calculate another checksum of zero
; page
0873-   A2 00       LDX   #$00
0875-   8A          TXA
0876-   55 00       EOR   $00,X
0878-   E8          INX
0879-   D0 FB       BNE   $0876

; get slot (x16) again
087B-   A6 2B       LDX   $2B

; jump to the code we just read into
; the text page
087D-   4C 00 04    JMP   $0400

Well that's lovely. I want to interrupt
the boot at $087D, but if I do, it will
modify the checksum that ends up in the
stack pointer.
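
To make the problem concrete, here is a
minimal sketch in Python. It is not
part of the original disk or my trace;
the name xor_fold and the sample page
contents are made up for illustration.
It models the running EOR at $0868 and
shows why a one-byte patch on page 8
changes the value that ends up in the
stack pointer, and why folding the same
bytes in twice cancels back to zero.

```python
def xor_fold(seed, data):
    """EOR every byte of data into seed."""
    acc = seed
    for b in data:
        acc ^= b
    return acc & 0xFF

# Hypothetical page contents; the real
# page 8 is the boot0 code listed above.
page8 = bytes((i * 7 + 3) & 0xFF for i in range(256))

# boot0 folds in $0801..$08FF (X runs
# from 1 to 255), starting from zero.
original = xor_fold(0, page8[1:])

# Patch one byte, as a tracer would.
patched = bytearray(page8)
patched[0x7E] ^= 0x01
changed = xor_fold(0, bytes(patched[1:]))

# Folding identical data in a second
# time undoes the first pass.
zero_page = bytes(range(256))
first = xor_fold(0, zero_page)
second = xor_fold(first, zero_page)
```

Any single changed byte flips at least
one bit of the result, which is why the
patches have to live in a relocated
copy instead of on page 8 itself.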

It's also wiping main memory, including
the place I usually put my boot trace
callbacks (around $9700).

So, a three-pronged attack:

1. Relocate the code to $0900. Most of
   it uses relative branching already,
   except for one JMP at $086E, which I
   can patch. The code will still run,
   but I'll be able to patch it without
   altering the checksum.
2. Disable the memory wipe at $095D.
3. Patch the code at $097D to jump to a
   routine under my control.

                   ~

               Chapter 2
       In Which Nothing Happens,
             Inhospitably


*9600<C600.C6FFM

; relocate the code from $0800 to $0900
96F8-   A0 00       LDY   #$00
96FA-   B9 00 08    LDA   $0800,Y
96FD-   99 00 09    STA   $0900,Y
9700-   C8          INY
9701-   D0 F7       BNE   $96FA

; disable the memory wipe by changing
; STA to BIT
9703-   A9 24       LDA   #$24
9705-   8D 5D 09    STA   $095D

; fix the absolute JMP address
9708-   A9 09       LDA   #$09
970A-   8D 70 09    STA   $0970

; set up the callback
970D-   A9 1A       LDA   #$1A
970F-   8D 7E 09    STA   $097E
9712-   A9 97       LDA   #$97
9714-   8D 7F 09    STA   $097F

; start the boot
9717-   4C 01 09    JMP   $0901

; callback is here
; copy the code on the text page to
; higher memory so it will survive a
; reboot
971A-   A2 04       LDX   #$04
971C-   A0 00       LDY   #$00
971E-   B9 00 04    LDA   $0400,Y
9721-   99 00 24    STA   $2400,Y
9724-   C8          INY
9725-   D0 F7       BNE   $971E
9727-   EE 20 97    INC   $9720
972A-   EE 23 97    INC   $9723
972D-   CA          DEX
972E-   D0 EE       BNE   $971E

; turn off slot 6 drive motor and
; reboot to my work disk in slot 5
9730-   AD E8 C0    LDA   $C0E8
9733-   4C 00 C5    JMP   $C500

*BSAVE TRACE,A$9600,L$136
*9600G
...reboots slot 6...
...reboots slot 5...

]BSAVE OBJ.0400-07FF,A$2400,L$400
]CALL -151

I'm going to leave this code at $2400.
Relative branches will look correct,
but absolute addresses will be off by
$2000.

*2400L

; calculate another checksum of zero
; page, starting with the value of the
; previous checksum (at $0873)
2400-   A0 00       LDY   #$00
2402-   59 00 00    EOR   $0000,Y
2405-   C8          INY
2406-   D0 FA       BNE   $2402
2408-   A8          TAY

; if equal, nothing has changed (we've
; EOR'd everything twice, so we're back
; to zero)
2409-   F0 03       BEQ   $240E

; if checksums don't match, jump to
; (what I presume is) The Badlands
240B-   4C 40 05    JMP   $0540

*2540L

; clear most of main memory, starting
; at $0C00
2540-   A0 00       LDY   #$00
2542-   84 00       STY   $00
2544-   A9 0C       LDA   #$0C
2546-   85 01       STA   $01
2548-   A2 B4       LDX   #$B4
254A-   98          TYA
254B-   91 00       STA   ($00),Y
254D-   C8          INY
254E-   D0 FB       BNE   $254B
2550-   E6 01       INC   $01
2552-   CA          DEX
2553-   D0 F6       BNE   $254B

; play a cute sound
2555-   A9 C0       LDA   #$C0
2557-   85 00       STA   $00
2559-   A0 C0       LDY   #$C0
255B-   AD 30 C0    LDA   $C030
255E-   A6 00       LDX   $00
2560-   CA          DEX
2561-   D0 FD       BNE   $2560
2563-   88          DEY
2564-   D0 F5       BNE   $255B
2566-   46 00       LSR   $00
2568-   D0 EF       BNE   $2559

; and reboot
256A-   A6 2B       LDX   $2B
256C-   CA          DEX
256D-   8A          TXA
256E-   4A          LSR
256F-   4A          LSR
2570-   4A          LSR
2571-   4A          LSR
2572-   09 C0       ORA   #$C0
2574-   48          PHA
2575-   A9 FF       LDA   #$FF
2577-   48          PHA
2578-   60          RTS
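
The arithmetic in that reboot routine
is worth spelling out. This is a small
Python sketch under my reading of the
listing (the function name is mine, not
anything on the disk). It turns the
slot number (x16) in zp$2B into the
address that the RTS ends up jumping
to.

```python
def reboot_target(slot_x16):
    """Model $056A..$0578 for a boot slot (x16)."""
    x = (slot_x16 - 1) & 0xFF       # DEX: $60 -> $5F
    hi = (x >> 4) | 0xC0            # four LSRs, then ORA #$C0
    lo = 0xFF                       # LDA #$FF
    pushed = (hi << 8) | lo         # address left on the stack
    return (pushed + 1) & 0xFFFF    # RTS adds 1 to what it pulls
```

For slot 6 this pushes $C5FF, so the
RTS lands on $C600, the start of the
slot 6 disk controller ROM.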

Continuing from $040E...

; set reset vector to The Badlands
240E-   A9 40       LDA   #$40
2410-   8D F2 03    STA   $03F2
2413-   A9 05       LDA   #$05
2415-   8D F3 03    STA   $03F3
2418-   49 A5       EOR   #$A5
241A-   8D F4 03    STA   $03F4
241D-   86 2B       STX   $2B
241F-   EA          NOP

; read from ROM but write to RAM bank 2
2420-   AD 81 C0    LDA   $C081
2423-   AD 81 C0    LDA   $C081

; wipe RAM bank 2 by copying ROM
2426-   A0 00       LDY   #$00
2428-   84 00       STY   $00
242A-   A9 D0       LDA   #$D0
242C-   85 01       STA   $01
242E-   B1 00       LDA   ($00),Y
2430-   91 00       STA   ($00),Y
2432-   C8          INY
2433-   D0 F9       BNE   $242E
2435-   E6 01       INC   $01
2437-   D0 F5       BNE   $242E

; set low-level reset vector while the
; language card RAM is writeable (also
; to The Badlands)
2439-   A9 40       LDA   #$40
243B-   8D FC FF    STA   $FFFC
243E-   A9 05       LDA   #$05
2440-   8D FD FF    STA   $FFFD

; switch back to ROM
2443-   AD 80 C0    LDA   $C080

; set input and output vectors to
; something unfriendly
2446-   A9 A2       LDA   #$A2
2448-   85 36       STA   $36
244A-   85 38       STA   $38
244C-   A9 05       LDA   #$05
244E-   85 37       STA   $37
2450-   85 39       STA   $39

; take the checksum from boot0 (that we
; stashed in the stack pointer) and put
; it in zero page $0B
2452-   A9 00       LDA   #$00
2454-   BA          TSX
2455-   86 0B       STX   $0B
2457-   85 0C       STA   $0C
2459-   85 0D       STA   $0D
245B-   85 0E       STA   $0E

; use that checksum (now in zero page
; $0B) as the starting value of ANOTHER
; checksum of all the code on the text
; page (including this code right here)
245D-   A5 0B       LDA   $0B
245F-   A2 00       LDX   #$00
2461-   5D 00 04    EOR   $0400,X
2464-   5D 00 05    EOR   $0500,X
2467-   5D 00 06    EOR   $0600,X
246A-   5D 00 07    EOR   $0700,X
246D-   E8          INX
246E-   D0 F1       BNE   $2461
2470-   AA          TAX

; and put the new checksum back into
; the stack pointer
2471-   9A          TXS

And that's where I get to interrupt the
boot again.

                   ~

               Chapter 3
    You're Very Clever, Young Man,
  But It's Checksums All The Way Down


*9600<C600.C6FFM

; move boot0 to $0900 and patch it up
96F8-   A0 00       LDY   #$00
96FA-   B9 00 08    LDA   $0800,Y
96FD-   99 00 09    STA   $0900,Y
9700-   C8          INY
9701-   D0 F7       BNE   $96FA
9703-   A9 24       LDA   #$24
9705-   8D 5D 09    STA   $095D
9708-   A9 09       LDA   #$09
970A-   8D 70 09    STA   $0970

; set up callback after first checksum
; is calculated
970D-   A9 1A       LDA   #$1A
970F-   8D 7E 09    STA   $097E
9712-   A9 97       LDA   #$97
9714-   8D 7F 09    STA   $097F

; start the boot
9717-   4C 01 09    JMP   $0901

; callback is here
; save the checksum and unconditionally
; break to the monitor
971A-   BA          TSX
971B-   8A          TXA
971C-   8D FF 97    STA   $97FF
971F-   AD E8 C0    LDA   $C0E8
9722-   4C 59 FF    JMP   $FF59

*BSAVE TRACE 0872 CHECKSUM,A$9600,L$125
*9600G
...reboots slot 6...
<beep>

*97FF

97FF- 20

The initial checksum of boot0 is $20.

*C500G
...
]CALL -151

*9600<C600.C6FFM

; move boot0 to $0900 and patch it up
96F8-   A0 00       LDY   #$00
96FA-   B9 00 08    LDA   $0800,Y
96FD-   99 00 09    STA   $0900,Y
9700-   C8          INY
9701-   D0 F7       BNE   $96FA
9703-   A9 24       LDA   #$24
9705-   8D 5D 09    STA   $095D
9708-   A9 09       LDA   #$09
970A-   8D 70 09    STA   $0970

; set up callback instead of jumping to
; boot1 at $0400
970D-   A9 1A       LDA   #$1A
970F-   8D 7E 09    STA   $097E
9712-   A9 97       LDA   #$97
9714-   8D 7F 09    STA   $097F

; start the boot
9717-   4C 01 09    JMP   $0901

; (callback is here)
; hard-code the initial checksum value
; ($20), then reproduce the checksum on
; the boot1 code before we start
; patching it to high heaven
971A-   A2 20       LDX   #$20
971C-   A9 00       LDA   #$00
971E-   86 0B       STX   $0B
9720-   85 0C       STA   $0C
9722-   85 0D       STA   $0D
9724-   85 0E       STA   $0E
9726-   A5 0B       LDA   $0B
9728-   A2 00       LDX   #$00
972A-   5D 00 04    EOR   $0400,X
972D-   5D 00 05    EOR   $0500,X
9730-   5D 00 06    EOR   $0600,X
9733-   5D 00 07    EOR   $0700,X
9736-   E8          INX
9737-   D0 F1       BNE   $972A

; store the new checksum and break
9739-   8D FF 97    STA   $97FF
973C-   AD E8 C0    LDA   $C0E8
973F-   4C 59 FF    JMP   $FF59

*BSAVE TRACE 0470 CHECKSUM,A$9600,L$142
*9600G
...reboots slot 6...
<beep>

*97FF

97FF- 25

The second checksum, which gets stashed
in the stack pointer at $0471, is $25.

                   ~

               Chapter 4
   Half A Track Is Better Than None


Continuing the boot trace at $0472...

*C500G
...
]BLOAD OBJ.0400-07FF,A$2400
]CALL -151

*2472L

2472-   A0 03       LDY   #$03
2474-   20 DC 04    JSR   $04DC

*24DCL

; advance drive head by one phase
; (a.k.a. a half track)
24DC-   E6 0C       INC   $0C
24DE-   A5 0C       LDA   $0C
24E0-   29 03       AND   #$03
24E2-   0A          ASL
24E3-   05 2B       ORA   $2B
24E5-   AA          TAX
24E6-   BD 81 C0    LDA   $C081,X
24E9-   20 F8 04    JSR   $04F8
24EC-   BD 80 C0    LDA   $C080,X
24EF-   20 F8 04    JSR   $04F8

; loop a number of times (given in the
; Y register on entry)
24F2-   88          DEY
24F3-   D0 E7       BNE   $24DC
24F5-   A6 2B       LDX   $2B
24F7-   60          RTS

; wait routine (called from $04E9 and
; $04EF)
24F8-   A9 40       LDA   #$40
24FA-   8D 50 C0    STA   $C050
24FD-   4C A8 FC    JMP   $FCA8

We started on track 0 and advanced the
drive head by 3 phases, so now we're
on track 1.5.
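
Here is the same head movement as a
Python sketch, assuming the counter in
zp$0C starts at 0 and the boot slot is
6 (the function name is mine). It lists
the phase-on and phase-off soft
switches that each pass through $04DC
touches.

```python
def step_phases(count, start=0, slot_x16=0x60):
    """Return (phase_on, phase_off) addresses per step."""
    touched = []
    counter = start
    for _ in range(count):
        counter = (counter + 1) & 0xFF          # INC $0C
        x = ((counter & 0x03) << 1) | slot_x16  # AND/ASL/ORA $2B
        touched.append((0xC081 + x, 0xC080 + x))
    return touched

steps = step_phases(3)
# Each phase is half a track.
track = len(steps) / 2
```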

Continuing from $0477...

; get target memory page from an array
; at $05D0
2477-   A4 0E       LDY   $0E
2479-   B9 D0 05    LDA   $05D0,Y

; if page = 0, jump to next stage
; at $0500, otherwise continue at $0481
247C-   D0 03       BNE   $2481
247E-   4C 00 05    JMP   $0500
2481-   20 90 04    JSR   $0490

*2490L

; sector count (4-4 encoded tracks can
; only hold $0C pages worth of data)
2490-   85 05       STA   $05
2492-   18          CLC
2493-   A9 0C       LDA   #$0C
2495-   85 06       STA   $06
2497-   A0 00       LDY   #$00
2499-   84 04       STY   $04

; match custom prologue "D5 AA DA"
249B-   BD 8C C0    LDA   $C08C,X
249E-   10 FB       BPL   $249B
24A0-   C9 D5       CMP   #$D5
24A2-   D0 F7       BNE   $249B
24A4-   BD 8C C0    LDA   $C08C,X
24A7-   10 FB       BPL   $24A4
24A9-   C9 AA       CMP   #$AA
24AB-   D0 F3       BNE   $24A0
24AD-   BD 8C C0    LDA   $C08C,X
24B0-   10 FB       BPL   $24AD
24B2-   C9 DA       CMP   #$DA
24B4-   D0 EA       BNE   $24A0

; now read 4-4 encoded data into ($04)
24B6-   BD 8C C0    LDA   $C08C,X
24B9-   10 FB       BPL   $24B6
24BB-   38          SEC
24BC-   2A          ROL
24BD-   85 0F       STA   $0F
24BF-   8D 50 C0    STA   $C050
24C2-   BD 8C C0    LDA   $C08C,X
24C5-   10 FB       BPL   $24C2
24C7-   25 0F       AND   $0F
24C9-   91 04       STA   ($04),Y
24CB-   C8          INY
24CC-   D0 E8       BNE   $24B6

; increment target page
24CE-   E6 05       INC   $05

; decrement count
24D0-   C6 06       DEC   $06

; Loop back to read more. Note: this
; goes directly to data read routine,
; not the prologue match routine. There
; is only one prologue per track.
24D2-   D0 E2       BNE   $24B6
24D4-   60          RTS
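
The decode step at $04B6 is compact
enough to model directly. A minimal
Python sketch follows. The decoder
mirrors the listing; the encoder is my
own assumption about how the disk was
mastered (it is simply the inverse, not
code from the disk).

```python
def decode_44(first, second):
    """SEC / ROL on the first nibble, then AND with the second."""
    rotated = ((first << 1) | 1) & 0xFF
    return rotated & second

def encode_44(value):
    """Assumed inverse: odd bits first, then even bits."""
    return (value >> 1) | 0xAA, value | 0xAA

# $0C pages per track, two nibbles per byte
nibbles_per_track = 0x0C * 0x100 * 2
```

Every byte costs two nibbles on disk,
which is why a track in this format
holds fewer pages than a 16-sector
track does.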

Continuing from $0484...

*2484L

; not shown, but the subroutine
; sets Y=2 and falls through to drive
; head advance routine, so this will
; skip ahead 2 phases = 1 whole track,
; so we're still on half tracks but now
; 2.5, 3.5, 4.5, &c.
2484-   20 D8 04    JSR   $04D8

; show hi-res screen, increment index
; into page array, and jump back to
; read the next track
2487-   20 00 06    JSR   $0600

*2600L

; set some addresses that are likely to
; be important later
2600-   A9 00       LDA   #$00
2602-   8D D1 6D    STA   $6DD1
2605-   8D D2 6D    STA   $6DD2
2608-   8D D3 6D    STA   $6DD3
260B-   A9 60       LDA   #$60
260D-   8D 7F 7F    STA   $7F7F
2610-   A9 05       LDA   #$05
2612-   85 00       STA   $00
2614-   A9 3B       LDA   #$3B
2616-   85 01       STA   $01
2618-   A9 C0       LDA   #$C0
261A-   85 02       STA   $02
261C-   A9 84       LDA   #$84
261E-   85 03       STA   $03
2620-   A9 95       LDA   #$95
2622-   85 05       STA   $05
2624-   A9 14       LDA   #$14
2626-   85 06       STA   $06
2628-   A0 15       LDY   #$15
262A-   B1 01       LDA   ($01),Y
262C-   A9 55       LDA   #$55
262E-   85 FE       STA   $FE
2630-   A9 FD       LDA   #$FD
2632-   85 FF       STA   $FF
2634-   60          RTS

If I know anything about anything, that
will prove to be important later.

Continuing from $048A...

; increment page index
248A-   E6 0E       INC   $0E

; and branch back (exits via $0500 when
; the target page = 0)
248C-   4C 77 04    JMP   $0477

Here is the target page table (accessed
at $0479):

*25D0.

25D0- 0C 18 24 30 3C 48 54 60
25D8- 6C 78 84 90 9C A8 B4 00

Each call to $0490 reads $0C pages, so
we're filling $0C00..$BFFF entirely.
Once the page array is exhausted, $047E
jumps to $0500 for the next boot stage.

To sum up:

  - We're reading data from consecutive
    half tracks (1.5, 2.5, 3.5, &c.)
  - Each track has $0C pages of data in
    a custom (non-sector-based) format
  - We're filling $0C00..$BFFF in main
    memory
  - Nothing in this read loop relies on
    the checksum we stashed in the
    stack pointer or the later checksum
    we pushed twice to the stack
  - $047E exits via $0500
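
That summary can be sketched in Python
as a load plan. The table is copied
from the dump at $05D0; the function
name and the tuple layout are mine, and
the starting phase of 3 assumes the
head began on track 0.

```python
PAGE_TABLE = [0x0C, 0x18, 0x24, 0x30, 0x3C, 0x48, 0x54, 0x60,
              0x6C, 0x78, 0x84, 0x90, 0x9C, 0xA8, 0xB4, 0x00]
PAGES_PER_TRACK = 0x0C

def load_plan(table=PAGE_TABLE, first_phase=3):
    """Return (track, first address, last address) per read."""
    plan = []
    phase = first_phase
    for page in table:
        if page == 0:            # end marker: exit via $0500
            break
        start = page << 8
        end = start + PAGES_PER_TRACK * 0x100 - 1
        plan.append((phase / 2, start, end))
        phase += 2               # two phases = one whole track
    return plan

plan = load_plan()
```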

Let's capture it.

                   ~

               Chapter 5
    In Which Things Have Been Made
    As Difficult As Possible For Us


*9600<C600.C6FFM

*96F8L

; move boot0 to $0900 and patch it up
96F8-   A0 00       LDY   #$00
96FA-   B9 00 08    LDA   $0800,Y
96FD-   99 00 09    STA   $0900,Y
9700-   C8          INY
9701-   D0 F7       BNE   $96FA
9703-   A9 24       LDA   #$24
9705-   8D 5D 09    STA   $095D
9708-   A9 09       LDA   #$09
970A-   8D 70 09    STA   $0970

; set up callback before jumping to
; $0400
970D-   A9 1A       LDA   #$1A
970F-   8D 7E 09    STA   $097E
9712-   A9 97       LDA   #$97
9714-   8D 7F 09    STA   $097F

; start the boot
9717-   4C 01 09    JMP   $0901

; initialize zero page (copied verbatim
; from $0457)
971A-   A9 00       LDA   #$00
971C-   85 0B       STA   $0B
971E-   85 0C       STA   $0C
9720-   85 0D       STA   $0D
9722-   85 0E       STA   $0E

; break to the monitor at $047E instead
; of continuing at $0500
9724-   A9 4C       LDA   #$4C
9726-   8D 7E 04    STA   $047E
9729-   A9 59       LDA   #$59
972B-   8D 7F 04    STA   $047F
972E-   A9 FF       LDA   #$FF
9730-   8D 80 04    STA   $0480
9733-   4C 72 04    JMP   $0472

*BSAVE TRACE2,A$9600,L$136

; fill main memory so I can verify
; which pages changed (in case I made
; a mistake in my analysis earlier!)
*800:FD N 801<800.BEFEM

*BRUN TRACE2
...reboots slot 6...
<beep>

A quick inspection of memory confirms
that $0C00..$BFFF have changed, and the
rest are untouched (except $0800 for
boot0 and the text page for boot1, but
I knew about those already).

*C500G
...

]BSAVE OBJ.0C00-7FFF,A$C00,L$7400
]BRUN TRACE2
...reboots slot 6...
<beep>

*2000<8000.BFFFM
*C500G
...

]BSAVE OBJ.8000-BEFF,A$2000,L$3F00
]BSAVE OBJ.BF00-BFFF,A$5F00,L$100

That's it; that's the entire game code.
Now back to the bootloader to see where
the entry point is.

]BLOAD OBJ.0400-07FF,A$2400
]CALL -151

*2500L

; turn off drive motor
2500-   BD 88 C0    LDA   $C088,X

; checksum entire game code in memory
2503-   A9 0C       LDA   #$0C
2505-   85 81       STA   $81
2507-   A9 00       LDA   #$00
2509-   85 80       STA   $80
250B-   A8          TAY
250C-   A2 B4       LDX   #$B4
250E-   51 80       EOR   ($80),Y
2510-   C8          INY
2511-   D0 FB       BNE   $250E
2513-   E6 81       INC   $81
2515-   CA          DEX
2516-   D0 F6       BNE   $250E
2518-   A8          TAY

; if checksum fails, it's off to The
; Badlands with you!
2519-   D0 25       BNE   $2540

; transfer the stack pointer to X --
; remember this was set as the result
; of the checksum back at $0470
251B-   BA          TSX

Now X is #$25.

251C-   A0 00       LDY   #$00
251E-   B1 FF       LDA   ($FF),Y
2520-   48          PHA

zp$FF was set to #$FD at $0632. The
behavior of this addressing mode is
strange, though. We're using zp$FF as
the low byte of an address. But zero
page always wraps around, so the high
byte of the address is zp$00, not
$0100. zp$00 was set to #$05 at $0612.
So the address we're loading is $05FD
(+Y, which is 0), in memory now at
$25FD.

*25FD

25FD- 71

And that's what gets pushed to the
stack: #$71.

Continuing...

2521-   C8          INY
2522-   B1 FF       LDA   ($FF),Y
2524-   48          PHA

Same addressing mode, but now Y is 1,
so we're getting the value of $05FE.

*25FE

25FE- 42

So we've pushed #$71/#$42 to the stack.

2525-   8A          TXA
2526-   85 37       STA   $37

Now zp$37 is #$25.

2528-   49 B6       EOR   #$B6
252A-   99 FF 01    STA   $01FF,Y

Now $0200 is #$25 XOR #$B6 = #$93.

; set up the rest of zero page in bulk
252D-   A0 60       LDY   #$60
252F-   B9 00 07    LDA   $0700,Y
2532-   99 00 00    STA   $0000,Y
2535-   C8          INY
2536-   D0 F7       BNE   $252F

; and "exit" via the address we just
; pushed to the stack
2538-   60          RTS

"RTS" pops the two bytes we pushed and
adds 1, so the entry point of the game
is $7143.
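
To double-check that chain of
reasoning, here is a Python sketch of
the same steps. The helper name and the
tiny dictionaries standing in for
memory are my own; the byte values are
the ones quoted above.

```python
def indirect_indexed(zp, mem, ptr, y):
    """Model LDA (ptr),Y with zero page wraparound."""
    lo = zp[ptr]
    hi = zp[(ptr + 1) & 0xFF]   # $FF + 1 wraps to $00
    addr = (((hi << 8) | lo) + y) & 0xFFFF
    return mem[addr]

zp = {0xFF: 0xFD, 0x00: 0x05}
mem = {0x05FD: 0x71, 0x05FE: 0x42}

stack = []
stack.append(indirect_indexed(zp, mem, 0xFF, 0))  # PHA
stack.append(indirect_indexed(zp, mem, 0xFF, 1))  # PHA

# RTS pulls the low byte first, then
# the high byte, then adds 1.
lo = stack.pop()
hi = stack.pop()
entry = (((hi << 8) | lo) + 1) & 0xFFFF

# value stored at $0200
key = 0x25 ^ 0xB6
```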

*BLOAD OBJ.0C00-7FFF,A$C00
*7143L

7143-   1A          ???
7144-   AC 43 09    LDY   $0943
7147-   7A          ???
7148-   5A          ???
7149-   A8          TAY
714A-   BA          TSX
714B-   A8          TAY
714C-   BA          TSX
714D-   3A          ???
714E-   AE 84 01    LDX   $0184
7151-   3A          ???
7152-   7A          ???
7153-   1A          ???
7154-   AD 99 99    LDA   $9999
7157-   1A          ???
7158-   B8          CLV
7159-   4C 53 0F    JMP   $0F53
715C-   84 75       STY   $75
715E-   3A          ???
715F-   B8          CLV

I've missed something.

                   ~

               Chapter 6
          Undocumented Opcode
            Is Best Opcode


Actually, I haven't missed anything.
All those opcodes that show up as "???"
in the monitor listing are actually
valid (if undocumented) 6502 opcodes.
$1A, $3A, $5A, and $7A are all
equivalent to $EA -- a NOP. (Somewhat
surprisingly, these opcodes work even
on my enhanced Apple //e with a 65c02
processor.)

So this code does a bunch of seemingly
random things to registers, then
eventually jumps to $0F53.

*F53L

; well at least this looks like code!
; copy The Badlands to $0300
0F53-   A2 40       LDX   #$40
0F55-   BD 40 05    LDA   $0540,X
0F58-   9D 00 03    STA   $0300,X
0F5B-   CA          DEX
0F5C-   10 F7       BPL   $0F55

; switch to RAM
0F5E-   AD 81 C0    LDA   $C081
0F61-   AD 81 C0    LDA   $C081

; set high- and low-level reset vectors
0F64-   A9 00       LDA   #$00
0F66-   8D F2 03    STA   $03F2
0F69-   8D FC FF    STA   $FFFC
0F6C-   A9 03       LDA   #$03
0F6E-   8D F3 03    STA   $03F3
0F71-   8D FD FF    STA   $FFFD
0F74-   49 A5       EOR   #$A5
0F76-   8D F4 03    STA   $03F4

; back to ROM
0F79-   AD 80 C0    LDA   $C080

; and continue to the real entry point
0F7C-   4C 33 81    JMP   $8133

And now I have enough information to
run the game without the bootloader --
and see if I've *really* missed
something.

; get the bootloader back in memory
*BLOAD OBJ.0400-07FF,A$2400

; this will end up on zero page (we'll
; move it later)
*B60<2760.27FFM

; load the game
*BLOAD OBJ.0C00-7FFF,A$C00
*BLOAD OBJ.8000-BEFF,A$8000

; load last page in lower memory so it
; doesn't override Diversi-DOS (we'll
; move it later)
*BLOAD OBJ.BF00-BFFF,A$800

Now a short loader program that
initializes zero page and jumps to the
real entry point.

0B00-   A9 25       LDA   #$25
0B02-   85 37       STA   $37
0B04-   49 B6       EOR   #$B6
0B06-   8D 00 02    STA   $0200
0B09-   A0 60       LDY   #$60
0B0B-   B9 00 0B    LDA   $0B00,Y
0B0E-   99 00 00    STA   $0000,Y
0B11-   C8          INY
0B12-   D0 F7       BNE   $0B0B
0B14-   4C 33 81    JMP   $8133

*BSAVE LOADER,A$B00,L$100

; disconnect DOS
*FE89G FE93G

; move the last page into place
*BF00<800.8FFM

; and run our custom loader
*B00G
...works, and it is glorious...

                   ~

               Chapter 7
   In Which We Step, Ever So Gently,
         Into The 21st Century


I have all the game code. I know how to
initialize it and call it. Now to write
it all to disk. (We'll worry about
reading it back in just a minute.)

[S6,D1=blank formatted disk]
[S5,D1=my work disk]

]PR#5
...
]CALL -151

; page count (decremented)
0300-   A9 B5       LDA   #$B5
0302-   85 FF       STA   $FF

; logical sector (incremented)
0304-   A9 00       LDA   #$00
0306-   85 FE       STA   $FE

; call RWTS to write sector
0308-   A9 03       LDA   #$03
030A-   A0 88       LDY   #$88
030C-   20 D9 03    JSR   $03D9

; increment logical sector, wrap around
; from $0F to $00 and increment track
030F-   E6 FE       INC   $FE
0311-   A4 FE       LDY   $FE
0313-   C0 10       CPY   #$10
0315-   D0 07       BNE   $031E
0317-   A0 00       LDY   #$00
0319-   84 FE       STY   $FE
031B-   EE 8C 03    INC   $038C

; convert logical to physical sector
031E-   B9 40 03    LDA   $0340,Y
0321-   8D 8D 03    STA   $038D

; increment page to write
0324-   EE 91 03    INC   $0391

; loop until done with all $B5 pages
0327-   C6 FF       DEC   $FF
0329-   D0 DD       BNE   $0308
032B-   60          RTS

*340.34F

; logical to physical sector mapping
0340- 00 07 0E 06 0D 05 0C 04
0348- 0B 03 0A 02 09 01 08 0F

*388.397

; RWTS parameter table, pre-initialized
; with slot 6, drive 1, track $01,
; sector $00, address $0A00, and RWTS
; write command ($02)
0388- 01 60 01 00 01 00 FB F7
0390- 00 0A 00 00 02 00 00 60

*BSAVE MAKE,A$300,L$98

; load everything off-by-$100 so we
; leave $BF00+ untouched (this is the
; only page in main memory used by
; Diversi-DOS 64K)
*BLOAD LOADER,A$A00
*BLOAD OBJ.0C00-7FFF,A$B00
*BLOAD OBJ.8000-BEFF,A$7F00
*BLOAD OBJ.BF00-BFFF,A$BE00

[S6,D1=blank disk]

*300G        ; write game to disk

Now I have the entire game on tracks
$01-$0C of a standard 16-sector disk.
To read it back as quickly as possible,
I'll use qkumba's "0boot" bootloader,
newly updated to version 2.0 with
support for partial tracks.
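
As a sanity check on where everything
lands, here is a Python sketch of the
MAKE loop. It assumes a page count of
$B5 (the operand of the LDA at $0300)
and the starting values in the RWTS
parameter table; the function name and
the tuple layout are mine.

```python
PHYSICAL = [0x00, 0x07, 0x0E, 0x06, 0x0D, 0x05, 0x0C, 0x04,
            0x0B, 0x03, 0x0A, 0x02, 0x09, 0x01, 0x08, 0x0F]

def make_plan(pages=0xB5, track=0x01, page=0x0A):
    """Return (track, physical sector, memory page) per write."""
    writes = []
    logical = 0
    sector = PHYSICAL[logical]
    for _ in range(pages):
        writes.append((track, sector, page))
        logical += 1
        if logical == 0x10:      # wrap and move to next track
            logical = 0
            track += 1
        sector = PHYSICAL[logical]
        page += 1
    return writes

writes = make_plan()
```

Under those assumptions the last write
is on track $0C, which is only partly
filled.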

                   ~

               Chapter 8
               0boot 2.0


0boot lives on track $00, just like me.
Sector $00 (boot0) reuses the disk
controller ROM routine to read sector
$0E (boot1). Boot0 creates a few data
tables, copies boot1 to zero page,
modifies it to accommodate booting from
any slot, and jumps to it.

Boot0 is loaded at $0800 by the disk
controller ROM routine.

; tell the ROM to load only this sector
; (we'll do the rest manually)
0800-  [01]

; The accumulator is $01 after loading
; sector $00, or $03 after loading
; sector $0E. We don't need to preserve
; the value, so we just shift the bits
; to determine whether this is the
; first or second time we've been here.
0801-   4A          LSR

; second run -- we've loaded boot1, so
; skip to boot1 initialization routine
0802-   D0 0E       BNE   $0812

; first run -- increment the physical
; sector to read (this will be the next
; sector under the drive head, so we'll
; waste as little time as possible
; waiting for the disk to spin)
0804-   E6 3D       INC   $3D

; X holds the boot slot (x16) --
; munge it into $Cx format (e.g. $C6
; for slot 6, but we need to accommodate
; booting from any slot)
0806-   8A          TXA
0807-   4A          LSR
0808-   4A          LSR
0809-   4A          LSR
080A-   4A          LSR
080B-   09 C0       ORA   #$C0

; push address (-1) of the sector read
; routine in the disk controller ROM
080D-   48          PHA
080E-   A9 5B       LDA   #$5B
0810-   48          PHA

; "return" via disk controller ROM,
; which reads boot1 into $0900 and
; exits via $0801
0811-   60          RTS

; Execution continues here (from $0802)
; after boot1 code has been loaded into
; $0900. This works around a bug in the
; CFFA 3000 firmware that doesn't
; guarantee that the Y register is
; always $00 at $0801, which is exactly
; the sort of bug that qkumba enjoys
; uncovering.
0812-   A8          TAY

; munge the boot slot, e.g. $60 -> $EC
; (to be used later)
0813-   8A          TXA
0814-   09 8C       ORA   #$8C

; Copy the boot1 code from $0901..$09FF
; to zero page. ($0900 holds the 0boot
; version number. This is version 2.
; $0000 is initialized later in boot1.)
0816-   BE 00 09    LDX   $0900,Y
0819-   96 00       STX   $00,Y
081B-   C8          INY
081C-   D0 F8       BNE   $0816

; There are a number of places in boot1
; that need to hit a slot-specific soft
; switch (read a nibble from disk, turn
; off the drive, &c). Rather than the
; usual form of "LDA $C08C,X", we will
; use "LDA $C0EC" and modify the $EC
; byte in advance, based on the boot
; slot. $00E4 is an array of all the
; places in the boot1 code that need
; this adjustment.
081E-   C8          INY
081F-   B6 E3       LDX   $E3,Y
0821-   95 00       STA   $00,X
0823-   D0 F9       BNE   $081E

; munge $EC -> $E0 (used later to
; advance the drive head to the next
; track)
0825-   29 F0       AND   #$F0
0827-   85 CB       STA   $CB

; munge $E0 -> $E8 (used later to
; turn off the drive motor)
0829-   09 08       ORA   #$08
082B-   85 D9       STA   $D9

; push several addresses to the stack
; (more on this later)
082D-   A2 06       LDX   #$06
082F-   B5 DD       LDA   $DD,X
0831-   48          PHA
0832-   CA          DEX
0833-   D0 FA       BNE   $082F

; number of tracks to load (x2) (game-
; specific; this game uses $0C tracks)
0835-   A0 18       LDY   #$18

; push $0003 to the stack (more on this
; later)
0837-   8A          TXA
0838-   48          PHA
0839-   A9 03       LDA   #$03
083B-   48          PHA
083C-   8A          TXA

; unconditional branch over the next
; loop
083D-   18          CLC
083E-   90 07       BCC   $0847

; loop starts here
0840-   8A          TXA

; every other time through this loop,
; we will end up taking this branch
0841-   90 03       BCC   $0846

; X is 0 going into this loop, and it
; never changes, so A is always 0 too.
; So this will push $0000 to the stack
; (to "return" to $0001, which reads a
; track into memory)
0843-   48          PHA
0844-   48          PHA

; There's a "SEC" hidden here (because
; it's opcode $38), but it's only
; executed if we take the branch at
; $0841, which lands at $0846, which is
; in the middle of this instruction.
; Otherwise we execute the compare,
; which clears the carry bit. So the
; carry flip-flops between set and
; clear, so the BCC at $0841 is only
; taken every other time.
0845-   C9 38       CMP   #$38

; Push $00B6 to the stack, to "return"
; to $00B7. This routine advances the
; drive head to the next half track.
0847-   48          PHA
0848-   A9 B6       LDA   #$B6
084A-   48          PHA

; loop until done
084B-   88          DEY
084C-   D0 F2       BNE   $0840

Because of the carry flip-flop, we will
push $00B6 to the stack every time
through the loop, but we will only push
$0000 every other time. The loop runs
for twice the number of tracks we want
to read, so the stack ends up looking
like this (remember all addresses are
off-by-1 because of how the Apple II
"returns" to stack addresses):

 --top--
  $00B6 (move drive 1/2 track)
  $00B6 (move drive another 1/2 track)
  $0000 (read track into memory)
  $00B6 \
  $00B6  } second group
  $0000 /
  $00B6 \
  $00B6  } third group
  $0000 /
  .
  . [repeated for each track]
  .
  $00B6 \
  $00B6  } final group
  $0000 /
  $0003 entry point to read the last
        few sectors from the final
        track (we can't read the entire
        track into memory because we'd
        end up overwriting $C000 ROM
        space and wreaking havoc with
        softswitches)
  $FE88 IN#0 (this and the following
        two addresses were pushed to
        the stack in the loop at $082F)
  $FE92 PR#0
  $00D7 turn off drive motor and jump
        to my game-specific custom
        loader at $0B00
--bottom--
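
The carry flip-flop is easier to trust
after simulating it. This Python sketch
follows my reading of the loop at
$0840..$084C and tracks only the carry
flag and the addresses pushed; the
function name is mine, and the three
addresses pushed earlier at $082F are
left out.

```python
def build_stack(count=0x18):
    """Return pushed addresses, top of stack first."""
    pushed = [0x0003]           # pushed at $0837..$083B
    carry = False               # CLC at $083D
    first_pass = True
    for _ in range(count):      # Y counts down from $18
        if first_pass:
            first_pass = False  # entered at $0847 directly
        elif not carry:
            carry = True        # BCC taken, lands on hidden SEC
        else:
            pushed.append(0x0000)
            carry = False       # CMP #$38 with A=0 clears carry
        pushed.append(0x00B6)
    return pushed[::-1]

stack = build_stack()
```

In this model every read, including the
final partial one at $0003, is preceded
by two half-track moves.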

Boot1 reads the game into memory from
tracks $01-$0C, but it isn't a loop.
It's one routine that reads a track and
another routine that advances the drive
head. We're essentially unrolling the
read loop on the stack, in advance, so
that each routine gets called as many
times as we need, when we need it. Like
dancers in a chorus line, each routine
executes then cedes the spotlight. Each
seems unaware of the others, but in
reality they've all been meticulously
choreographed.
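
If the carry flip-flop is hard to
picture, here is a rough Python model
of the push loop. It is a sketch, not
the boot0 code: build_stack is my own
name, and it assumes the accumulator
holds $00 at the top of each pass and
the carry is set on the first pass.

```python
def build_stack(tracks, carry=True):
    """Model the push loop at $0840.

    Returns the 16-bit entries with the
    top of the stack first. Assumes A=$00
    at the top of each pass and the carry
    set on the first pass (assumptions).
    """
    pushed = []                  # oldest entry first
    for _ in range(tracks * 2):
        if carry:
            # BCC not taken: PHA, PHA pushes $0000,
            # then CMP #$38 clears the carry
            pushed.append(0x0000)
            carry = False
        else:
            # BCC taken: lands on the hidden SEC
            carry = True
        # both paths push $00, then $B6
        pushed.append(0x00B6)
    return pushed[::-1]          # newest entry first
```

For one track, build_stack(1) gives
$00B6, $00B6, $0000 from the top down,
which is one group in the diagram
above.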

                   ~

               Chapter 9
                 6 + 2


Before I can explain the next chunk of
code, I need to pause and explain a
little bit of theory. As you probably
know if you're the sort of person who
reads this sort of thing, Apple II
floppy disks do not contain the actual
data that ends up being loaded into
memory. Due to hardware limitations of
the original Disk II drive, data on
disk must be stored in an intermediate
format called "nibbles." Bytes in
memory are encoded into nibbles before
writing to disk, and nibbles that you
read from the disk must be decoded back
into bytes. The round trip is lossless
but requires some bit wrangling.

Decoding nibbles-on-disk into bytes-in-
memory is a multi-step process. In
"6-and-2 encoding" (used by DOS 3.3,
ProDOS, and all ".dsk" image files),
there are 64 possible values that you
may find in the data field (in the
range $96..$FF, but not all of those,
because some of them have bit patterns
that trip up the drive firmware). We'll
call these "raw nibbles."

Step 1: read $156 raw nibbles from the
data field. These values will range
from $96 to $FF, but as mentioned
earlier, not all values in that range
will appear on disk.

Now we have $156 raw nibbles.

Step 2: decode each of the raw nibbles
into a 6-bit byte between 0 and 63
(%00000000 and %00111111 in binary).
$96 is the lowest valid raw nibble, so
it gets decoded to 0. $97 is the next
valid raw nibble, so it's decoded to 1.
$98 and $99 are invalid, so we skip
them, and $9A gets decoded to 2. And so
on, up to $FF (the highest valid raw
nibble), which gets decoded to 63.

Now we have $156 6-bit bytes.
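
Which raw nibbles count as valid? Here
is a small Python sketch, assuming the
usual 6-and-2 rules: the high bit is
set, bits 0-6 contain at least two
adjacent 1 bits, and there is at most
one pair of adjacent 0 bits. The names
(is_valid, VALID, DECODE) are mine.

```python
def is_valid(n):
    """True if n can be a 6-and-2 raw nibble."""
    if not n & 0x80:
        return False             # high bit must be set
    low7 = n & 0x7F
    if not low7 & (low7 << 1):
        return False             # no adjacent 1 bits
    # bit i is set here when bits i and i-1 of n
    # are both zero (for i = 1..6)
    zero_pairs = ~(n | (n << 1)) & 0x7E
    return bin(zero_pairs).count("1") <= 1

# valid raw nibbles in ascending order
VALID = [n for n in range(0x100) if is_valid(n)]

# raw nibble -> decoded 6-bit value
DECODE = {raw: i for i, raw in enumerate(VALID)}
```

Under those rules exactly 64 values
survive, and each one's position in the
sorted list is its decoded value.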

Step 3: split up each of the first $56
6-bit bytes into pairs of bits. In
other words, each 6-bit byte becomes
three 2-bit bytes. These 2-bit bytes
are merged with the next $100 6-bit
bytes to create $100 8-bit bytes. Hence
the name, "6-and-2" encoding.

The exact process of how the bits are
split and merged is... complicated. The
first $56 6-bit bytes get split up into
2-bit bytes, but those two bits get
swapped (so %01 becomes %10 and vice-
versa). The other $100 6-bit bytes each
get multiplied by 4 (a.k.a. bit-shifted
two places left). This leaves a hole in
the lower two bits, which is filled by
one of the 2-bit bytes from the first
group.

A diagram might help. "a" through "x"
each represent one bit.

             -------------

1 decoded      3 decoded
nibble in  +   nibbles in   =  3 bytes
first $56      other $100


00abcdef       00ghijkl
               00mnopqr
   |           00stuvwx
   |
 split            |
   &           shifted
swapped        left x2
   |              |
   V              V

000000fe   +   ghijkl00   =   ghijklfe
000000dc   +   mnopqr00   =   mnopqrdc
000000ba   +   stuvwx00   =   stuvwxba

             -------------

Tada! Four 6-bit bytes

  00abcdef
  00ghijkl
  00mnopqr
  00stuvwx

become three 8-bit bytes

  ghijklfe
  mnopqrdc
  stuvwxba
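
The same diagram as a few lines of
Python, in case that reads easier. This
is only a sketch of the bit shuffling;
swap2 and merge are names I made up.

```python
def swap2(v):
    """Swap the two bits of a 2-bit value."""
    return ((v & 1) << 1) | ((v >> 1) & 1)

def merge(aux, hi0, hi1, hi2):
    """Combine one 6-bit value from the first $56
    with three 6-bit values from the other $100."""
    return (
        (hi0 << 2) | swap2(aux & 3),         # ghijklfe
        (hi1 << 2) | swap2((aux >> 2) & 3),  # mnopqrdc
        (hi2 << 2) | swap2((aux >> 4) & 3),  # stuvwxba
    )
```

For example, merge(0b000001, 0, 0, 0)
sets only bit "f", which lands in bit 1
of the first byte.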

When DOS 3.3 reads a sector, it reads
the first $56 raw nibbles, decodes them
into 6-bit bytes, and stashes them in a
temporary buffer (at $BC00). Then it
reads the other $100 raw nibbles,
decodes them into 6-bit bytes, and puts
them in another temporary buffer (at
$BB00). Only then does DOS 3.3 start
combining the bits from each group to
create the full 8-bit bytes that will
end up in the target page in memory.
This is why DOS 3.3 "misses" sectors
when it's reading, because it's busy
twiddling bits while the disk is still
spinning.

                   ~

              Chapter 10
             Back to 0boot


0boot also uses "6-and-2" encoding. The
first $56 nibbles in the data field are
still split into pairs of bits that
need to be merged with nibbles that
won't come until later. But instead of
waiting for all $156 raw nibbles to be
read from disk, it "interleaves" the
nibble reads with the bit twiddling
required to merge the first $56 6-bit
bytes and the $100 that follow. By the
time 0boot gets to the data field
checksum, it has already stored all
$100 8-bit bytes in their final resting
place in memory. This means that 0boot
can read all 16 sectors on a track in
one revolution of the disk. That's
crazy fast.

To make it possible to do all the bit
twiddling we need to do and not miss
nibbles as the disk spins(*), we do
some of the work earlier. We multiply
each of the 64 possible decoded values
by 4 and store those values. (Since
this is accomplished by bit shifting
and we're doing it before we start
reading the disk, this is called the
"pre-shift" table.) We also store all
possible 2-bit values in a repeating
pattern that will make it easy to look
them up later. Then, as we're reading
from disk (and timing is tight), we can
simulate all the bit math we need to do
with a series of table lookups. There
is just enough time to convert each raw
nibble into its final 8-bit byte before
reading the next nibble.

(*) The disk spins independently of the
    CPU, and we only have a limited
    time to read a nibble and do what
    we're going to do with it before
    WHOOPS HERE COMES ANOTHER ONE. So
    time is of the essence. Also, "As
    The Disk Spins" would make a great
    name for a retrocomputing-themed
    soap opera. I am going to continue
    making this joke until someone
    makes it happen, then I promise I
    will stop.

The first table, at $0200..$02FF, is
three columns wide and 64 rows deep.
Astute readers will notice that 3 x 64
is not 256. Only three of the columns
are used; the fourth (unused) column
exists because multiplying by 3 is hard
but multiplying by 4 is easy (in base 2
anyway). The three columns correspond
to the three 2-bit values in each of
those first $56 6-bit bytes. Since the
values are only 2 bits wide, each
column holds one of four different
values (%00, %01, %10, or %11).

The second table, at $0300..$0369, is
the "pre-shift" table. This contains
all the possible 6-bit bytes, in order,
each multiplied by 4 (a.k.a. shifted to
the left two places, so the 6 bits that
started in columns 0-5 are now in
columns 2-7, and columns 0 and 1 are
zeroes). Like this:

       00ghijkl   -->   ghijkl00

Astute readers will notice that there
are only 64 possible 6-bit bytes, but
this second table is larger than 64
bytes. To make lookups easier, the
table has empty slots for each of the
invalid raw nibbles. In other words, we
don't do any math to decode raw nibbles
into 6-bit bytes; we just look them up
in this table (offset by $96, since
that's the lowest valid raw nibble) and
get the required bit shifting for free.


addr | raw |  decoded 6-bit | pre-shift
-----+-----+----------------+----------
$300 | $96 |  0 = %00000000 | %00000000
$301 | $97 |  1 = %00000001 | %00000100
$302 | $98        [invalid raw nibble]
$303 | $99        [invalid raw nibble]
$304 | $9A |  2 = %00000010 | %00001000
$305 | $9B |  3 = %00000011 | %00001100
$306 | $9C        [invalid raw nibble]
$307 | $9D |  4 = %00000100 | %00010000
  .
  .
  .
$368 | $FE | 62 = %00111110 | %11111000
$369 | $FF | 63 = %00111111 | %11111100


Each value in this "pre-shift" table
also serves as an index into the first
table (with all the 2-bit bytes). This
wasn't an accident; I mean, that sort
of magic doesn't just happen. But the
table of 2-bit bytes is arranged in
such a way that we take one of the raw
nibbles that needs to be decoded and
split apart (from the first $56 raw
nibbles in the data field), use that
raw nibble as an index into the pre-
shift table, then use that pre-shifted
value as an index into the first table
to get the 2-bit value we need. That's
a neat trick.

; this loop creates the pre-shift table
; at $300
084E-   A2 40       LDX   #$40
0850-   A4 58       LDY   $58
0852-   98          TYA
0853-   0A          ASL
0854-   24 58       BIT   $58
0856-   F0 12       BEQ   $086A
0858-   05 58       ORA   $58
085A-   49 FF       EOR   #$FF
085C-   29 7E       AND   #$7E
085E-   B0 0A       BCS   $086A
0860-   4A          LSR
0861-   D0 FB       BNE   $085E
0863-   CA          DEX
0864-   8A          TXA
0865-   0A          ASL
0866-   0A          ASL
0867-   99 EA 02    STA   $02EA,Y
086A-   C6 58       DEC   $58
086C-   D0 E2       BNE   $0850

And this is the result (".." means the
address is uninitialized and unused):

0300- 00 04 .. .. 08 0C .. 10
0308- 14 18 .. .. .. .. .. ..
0310- 1C 20 .. .. .. 24 28 2C
0318- 30 34 .. .. 38 3C 40 44
0320- 48 4C .. 50 54 58 5C 60
0328- 64 68 .. .. .. .. .. ..
0330- .. .. .. .. .. 6C .. 70
0338- 74 78 .. .. .. 7C .. ..
0340- 80 84 .. 88 8C 90 94 98
0348- 9C A0 .. .. .. .. .. A4
0350- A8 AC .. B0 B4 B8 BC C0
0358- C4 C8 .. .. CC D0 D4 D8
0360- DC E0 .. E4 E8 EC F0 F4
0368- F8 FC
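
To convince myself the loop really
produces that table, here is a line-by-
line Python transliteration. It is a
sketch under one assumption: that $58
holds $7F on entry, which I'm inferring
from the "EOR #$7F" at $0057 (whose
operand lives at $58). build_preshift
is my own name.

```python
def build_preshift(zp58=0x7F):
    """Return {address: value} as written by $084E."""
    mem = {}
    x = 0x40                        # LDX #$40
    while zp58:
        y = zp58                    # LDY $58
        a = (y << 1) & 0xFF         # TYA / ASL
        carry = y >> 7              # bit shifted out by ASL
        if a & zp58:                # BIT $58 / BEQ $086A
            # ORA $58 / EOR #$FF / AND #$7E
            a = ~(a | zp58) & 0x7E
            ok = True
            while True:
                if carry:           # BCS $086A
                    ok = False
                    break
                carry = a & 1       # LSR
                a >>= 1
                if a == 0:          # BNE $085E not taken
                    break
            if ok:
                x -= 1              # DEX
                # TXA / ASL / ASL / STA $02EA,Y
                mem[0x02EA + y] = (x << 2) & 0xFF
        zp58 -= 1                   # DEC $58 / BNE $0850
    return mem
```

The BIT/BEQ test rejects values with no
adjacent 1 bits, and the LSR loop
rejects values with more than one pair
of adjacent 0 bits.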

; this loop creates the table of 2-bit
; values at $200, magically arranged to
; enable easy lookups later
086E-   46 BA       LSR   $BA
0870-   46 BA       LSR   $BA
0872-   B5 EA       LDA   $EA,X
0874-   99 FF 01    STA   $01FF,Y
0877-   E6 AF       INC   $AF
0879-   A5 AF       LDA   $AF
087B-   25 BA       AND   $BA
087D-   D0 05       BNE   $0884
087F-   E8          INX
0880-   8A          TXA
0881-   29 03       AND   #$03
0883-   AA          TAX
0884-   C8          INY
0885-   C8          INY
0886-   C8          INY
0887-   C8          INY
0888-   C0 04       CPY   #$04
088A-   B0 E6       BCS   $0872
088C-   C8          INY
088D-   C0 04       CPY   #$04
088F-   90 DD       BCC   $086E

And this is the result:

0200- 00 00 00 .. 00 00 02 ..
0208- 00 00 01 .. 00 00 03 ..
0210- 00 02 00 .. 00 02 02 ..
0218- 00 02 01 .. 00 02 03 ..
0220- 00 01 00 .. 00 01 02 ..
0228- 00 01 01 .. 00 01 03 ..
0230- 00 03 00 .. 00 03 02 ..
0238- 00 03 01 .. 00 03 03 ..
0240- 02 00 00 .. 02 00 02 ..
0248- 02 00 01 .. 02 00 03 ..
0250- 02 02 00 .. 02 02 02 ..
0258- 02 02 01 .. 02 02 03 ..
0260- 02 01 00 .. 02 01 02 ..
0268- 02 01 01 .. 02 01 03 ..
0270- 02 03 00 .. 02 03 02 ..
0278- 02 03 01 .. 02 03 03 ..
0280- 01 00 00 .. 01 00 02 ..
0288- 01 00 01 .. 01 00 03 ..
0290- 01 02 00 .. 01 02 02 ..
0298- 01 02 01 .. 01 02 03 ..
02A0- 01 01 00 .. 01 01 02 ..
02A8- 01 01 01 .. 01 01 03 ..
02B0- 01 03 00 .. 01 03 02 ..
02B8- 01 03 01 .. 01 03 03 ..
02C0- 03 00 00 .. 03 00 02 ..
02C8- 03 00 01 .. 03 00 03 ..
02D0- 03 02 00 .. 03 02 02 ..
02D8- 03 02 01 .. 03 02 03 ..
02E0- 03 01 00 .. 03 01 02 ..
02E8- 03 01 01 .. 03 01 03 ..
02F0- 03 03 00 .. 03 03 02 ..
02F8- 03 03 01 .. 03 03 03 ..
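
Same treatment for this loop: a Python
transliteration, as a sketch. The
starting state is inferred, not quoted
from the listing: X=$00 and Y=$01 (left
over from the pre-shift loop), $BA=$3F
and $AF=$00 (the operands shown at
$00B9 and $00AE), and $EA..$ED holding
00 02 01 03 (read back from the table
above). build_two_bit is my own name.

```python
def build_two_bit():
    """Return {address: value} as written by $086E."""
    mem = {}
    zp_ea = (0x00, 0x02, 0x01, 0x03)   # $EA..$ED (inferred)
    zp_ba, zp_af = 0x3F, 0x00          # inferred start values
    x, y = 0, 1                        # inferred start values
    while True:
        zp_ba >>= 2                    # LSR $BA / LSR $BA
        while True:
            # LDA $EA,X / STA $01FF,Y
            mem[0x01FF + y] = zp_ea[x]
            zp_af = (zp_af + 1) & 0xFF # INC $AF
            if (zp_af & zp_ba) == 0:   # LDA $AF / AND $BA
                x = (x + 1) & 3        # INX / TXA / AND / TAX
            y = (y + 4) & 0xFF         # INY x4
            if y < 4:                  # BCS $0872 not taken
                break
        y += 1                         # INY
        if y >= 4:                     # BCC $086E not taken
            break
    return mem
```

The mask in $BA goes $0F, $03, $00, so
the first column changes every 16 rows,
the second every 4 rows, and the third
every row.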

And now for something completely
different. The original disk briefly
displayed an uninitialized hi-res
graphics page (originally at $0801 --
literally the first thing it does on
boot). So I want to do the same. It
won't be the absolute first thing, but
it'll be close.

0891-   2C 54 C0    BIT   $C054
0894-   2C 52 C0    BIT   $C052
0897-   2C 57 C0    BIT   $C057
089A-   2C 50 C0    BIT   $C050
089D-   60          RTS

[Note to future self: $0891..$08FF is
 available for game-specific init code,
 but it can't rely on or disturb zero
 page in any way. That rules out a lot
 of built-in ROM routines; be careful.
 If the game needs no initialization,
 you can zap this entire range and put
 an "RTS" at $0891.]

Everything else is already lined up on
the stack. All that's left to do is
"return" and let the stack guide us
through the rest of the boot.

                   ~

              Chapter 11
              0boot boot1


The rest of the boot runs from zero
page. It's hard to show you exactly
what boot1 will look like, because it
relies heavily on self-modifying code.

In a standard DOS 3.3 RWTS, the
softswitch to read the data latch is
"LDA $C08C,X", where X is the boot slot
times 16 (to allow disks to boot from
any slot). 0boot also supports booting
from any slot, but instead of using an
index, each fetch instruction is pre-
set based on the boot slot. We only
need to set this up once, because we're
only going to read from the disk once.

Not only does this free up the X
register, it lets us juggle all the
registers and put the raw nibble value
in whichever one is convenient at the
time. (We take full advantage of this
freedom.) I've marked each pre-set
softswitch with "o_O" to remind you
that self-modifying code is awesome.
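
As a rough illustration of what "pre-
set" means, here is a Python sketch.
The data latch address ($C08C plus 16
times the slot) is standard Disk II.
Everything else is hypothetical: the
names, and the guess that only the low
byte of each operand needs patching.
The addresses in FETCHES are the
instructions marked "o_O" in the
listings that follow.

```python
def data_latch(slot):
    """Disk II data latch address for a slot (1-7)."""
    return 0xC08C + (slot << 4)

# addresses of the fetch instructions marked o_O
FETCHES = (0x004C, 0x005E, 0x0076, 0x008E, 0x00A4, 0x00D2)

def preset_fetches(zero_page, slot):
    """Patch the operand low byte of each fetch.

    zero_page is a 256-byte bytearray standing in
    for $0000..$00FF (hypothetical helper).
    """
    low = data_latch(slot) & 0xFF
    for addr in FETCHES:
        zero_page[addr + 1] = low
```

For slot 6, data_latch(6) is $C0EC, so
each fetch would become "LDA $C0EC" (or
LDX, or LDY) with no index register
involved.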

There are several other instances of
addresses and constants that get
modified while boot1 is running. I've
marked these with "/!\" to remind you
that self-modifying code is dangerous
and you should not try this at home.

The first address popped off the stack
is $00B6, which "returns" to the drive
arm move routine at $00B7. It moves the
drive exactly one phase (half a track).

00B7-   E6 BA       INC   $BA

; This value was set at $00B7 (above).
; It's incremented monotonically, but
; it's ANDed with $03 later, so its
; exact value isn't relevant.
00B9-   A0 3F       LDY   #$3F      /!\

; short wait for PHASEON
00BB-   A9 04       LDA   #$04
00BD-   20 C3 00    JSR   $00C3

; fall through
00C0-   88          DEY

; longer wait for PHASEOFF
00C1-   69 41       ADC   #$41
00C3-   85 CE       STA   $CE

; calculate the proper stepper motor to
; access
00C5-   98          TYA
00C6-   29 03       AND   #$03
00C8-   2A          ROL
00C9-   AA          TAX

; This address was set at $0827,
; based on the boot slot.
00CA-   BD D1 C0    LDA   $C0D1,X   /!\

; This value was set at $00C3 so that
; PHASEON and PHASEOFF have optimal
; wait times.
00CD-   A9 D1       LDA   #$D1      /!\

; wait exactly the right amount of time
; after accessing the proper stepper
; motor
00CF-   4C A8 FC    JMP   $FCA8

Since the drive arm routine only moves
one phase, it was pushed to the stack
twice before each track read. Our game
is stored on whole tracks; this half-
track trickery is only to save a few
bytes of code in boot1.

The track read routine starts at $0001,
because that let us save 1 byte in the
boot0 code when we were pushing
addresses to the stack. (We could just
push $00 twice.)

; sectors-left-to-read-on-this-track
; counter (incremented to $00)
0001-   A2 F0       LDX   #$F0
0003-   2C A2 FB    BIT   $FBA2
0006-   86 00       STX   $00

Pay no attention to the BIT instruction
at $0003, which just happens to hide a
whole other instruction at $0004. We
will return to this later.

We initialize an array at $00DE that
tracks which sectors we've read from
the current track. Astute readers will
notice that this part of zero page had
real data in it -- some addresses that
were pushed to the stack, and some
other values that were used to create
the 2-bit table at $0200. All true, but
all those operations are now complete,
and the space from $00DE..$00FF is now
available for unrelated uses.

The array is in physical sector order,
thus the RWTS assumes data is stored in
physical sector order on each track.
(This is why my MAKE program had to map
to physical sector order when writing.
This saves 18 bytes: 16 for the table
and 2 for the lookup command!) Values
are the actual pages in memory where
that sector should go, and they get
zeroed once the sector is read (so we
don't waste time decoding the same
sector twice).

; starting address (game-specific;
; this one starts loading at $0B00)
0008-   A9 0B       LDA   #$0B      /!\
000A-   95 EE       STA   $EE,X
000C-   E6 09       INC   $09
000E-   E8          INX
000F-   D0 F7       BNE   $0008
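
A Python model of that little loop, as
a sketch (init_sector_map is my own
name). The one subtle part is that
"STA $EE,X" wraps around inside zero
page, so X=$F0 lands on $DE.

```python
def init_sector_map(first_page, sectors=16):
    """Model the loop at $0008.

    Returns (array, counter, next_page): the 16
    entries at $DE..$ED, the value stored in $00,
    and the page the next track would start at.
    """
    array = [0] * 16
    x = (0x100 - sectors) & 0xFF     # LDX #$F0 (or #$FB)
    counter = x                      # STX $00
    page = first_page
    while x != 0:
        addr = (0xEE + x) & 0xFF     # STA $EE,X wraps
        array[addr - 0xDE] = page
        page += 1                    # INC $09
        x = (x + 1) & 0xFF           # INX / BNE $0008
    return array, counter, page
```

With the defaults, init_sector_map($0B)
fills all 16 slots with pages $0B..$1A
and leaves $1B as the next starting
page.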

0011-   20 D2 00    JSR   $00D2

; subroutine reads a nibble and
; stores it in the accumulator
00D2-   AD D1 C0    LDA   $C0D1     o_O
00D5-   10 FB       BPL   $00D2
00D7-   60          RTS

Continuing from $0014 (wow that sounds
weird, doesn't it?)...

; first nibble must be $D5
0014-   C9 D5       CMP   #$D5
0016-   D0 F9       BNE   $0011

; read second nibble, must be $AA
0018-   20 D2 00    JSR   $00D2
001B-   C9 AA       CMP   #$AA
001D-   D0 F5       BNE   $0014

; We actually need the Y register to be
; $AA for unrelated reasons later, so
; let's set that now. (We have time,
; and it saves 1 byte!)
001F-   A8          TAY

; read the third nibble
0020-   20 D2 00    JSR   $00D2

; is it $AD?
0023-   49 AD       EOR   #$AD

; Yes, which means this is the data
; prologue. Branch forward to start
; reading the data field.
0025-   F0 1F       BEQ   $0046

If that third nibble is not $AD, we
assume it's the end of the address
prologue. ($96 would be the third
nibble of a standard address prologue,
but we don't actually check.) We fall
through and start decoding the 4-4
encoded values in the address field.

0027-   A0 02       LDY   #$02

The first time through this loop,
we'll read the disk volume number.
The second time, we'll read the track
number. The third time, we'll read
the physical sector number. We don't
actually care about the disk volume or
the track number, and once we get the
sector number, we don't verify the
address field checksum.

0029-   20 D2 00    JSR   $00D2
002C-   2A          ROL
002D-   85 AF       STA   $AF
002F-   20 D2 00    JSR   $00D2
0032-   25 AF       AND   $AF
0034-   88          DEY
0035-   10 F2       BPL   $0029
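
In Python, the ROL/AND pair looks like
this. It is a sketch that assumes the
carry is set when the ROL executes (the
matching CMP #$AA at $001B sets it
before the first pass, and each ROL of
a raw nibble sets it again). encode44
is there only for the round trip; both
names are mine.

```python
def encode44(value):
    """Split one byte into two 4-and-4 nibbles."""
    return (value >> 1) | 0xAA, value | 0xAA

def decode44(first, second):
    """ROL the first nibble (carry in = 1),
    then AND it with the second."""
    return ((first << 1) | 1) & second & 0xFF
```

The first nibble carries the odd bits
and the second carries the even bits,
so decode44(*encode44(v)) gives back v
for every byte value.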

; store the physical sector number
; (will re-use later)
0037-   85 AF       STA   $AF

; use physical sector number as an
; index into the sector address array
0039-   A8          TAY

; get the target page (where we want to
; store this sector in memory)
003A-   B6 DE       LDX   $DE,Y

; store the target page in several
; places throughout the following code
003C-   86 9E       STX   $9E
003E-   CA          DEX
003F-   86 6E       STX   $6E
0041-   86 86       STX   $86
0043-   E8          INX

; This is an unconditional branch,
; because the ROL at $002C will always
; set the carry. We're done processing
; the address field, so we need to loop
; back and wait for the data prologue.
0044-   B0 CB       BCS   $0011

; execution continues here (from $0025)
; after matching the data prologue
0046-   E0 00       CPX   #$00

; If X is still $00, it means we found
; a data prologue before we found an
; address prologue. In that case, we
; have to skip this sector, because we
; don't know which sector it is and we
; wouldn't know where to put it. Sad!
0048-   F0 C7       BEQ   $0011

Nibble loop #1 reads nibbles $00..$55,
looks up the corresponding offset in
the preshift table at $0300, and stores
that offset in the temporary buffer at
$036A.

; initialize rolling checksum to $00
004A-   85 58       STA   $58
004C-   AE D1 C0    LDX   $C0D1      o_O
004F-   10 FB       BPL   $004C

; The nibble value is in the X register
; now. The lowest possible nibble value
; is $96 and the highest is $FF. To
; look up the offset in the table at
; $0300, we need to subtract $96 from
; $0300 and add X.
0051-   BD 6A 02    LDA   $026A,X

; Now the accumulator has the offset
; into the table of individual 2-bit
; combinations ($0200..$02FF). Store
; that offset in the temporary buffer
; at $036A, in the order we read the
; nibbles. But the Y register started
; counting at $AA, so we need to
; subtract $AA from $036A and add Y.
0054-   99 C0 02    STA   $02C0,Y

; The EOR value is set at $004A
; each time through loop #1.
0057-   49 7F       EOR   #$7F      /!\
0059-   C8          INY
005A-   D0 EE       BNE   $004A

Here endeth nibble loop #1.

Nibble loop #2 reads nibbles $56..$AB,
combines them with bits 0-1 of the
appropriate nibble from the first $56,
and stores them in bytes $00..$55 of
the target page in memory.

005C-   A0 AA       LDY   #$AA
005E-   AE D1 C0    LDX   $C0D1     o_O
0061-   10 FB       BPL   $005E
0063-   5D 6A 02    EOR   $026A,X
0066-   BE C0 02    LDX   $02C0,Y
0069-   5D 02 02    EOR   $0202,X

; This address was set at $003F
; based on the target page (minus 1
; so we can add Y from $AA..$FF).
006C-   99 56 D1    STA   $D156,Y   /!\
006F-   C8          INY
0070-   D0 EC       BNE   $005E

Here endeth nibble loop #2.

Nibble loop #3 reads nibbles $AC..$101,
combines them with bits 2-3 of the
appropriate nibble from the first $56,
and stores them in bytes $56..$AB of
the target page in memory.

0072-   29 FC       AND   #$FC
0074-   A0 AA       LDY   #$AA
0076-   AE D1 C0    LDX   $C0D1     o_O
0079-   10 FB       BPL   $0076
007B-   5D 6A 02    EOR   $026A,X
007E-   BE C0 02    LDX   $02C0,Y
0081-   5D 01 02    EOR   $0201,X

; This address was set at $0041
; based on the target page (minus 1
; so we can add Y from $AA..$FF).
0084-   99 AC D1    STA   $D1AC,Y   /!\
0087-   C8          INY
0088-   D0 EC       BNE   $0076

Here endeth nibble loop #3.

Nibble loop #4 reads nibbles $102..$155,
combines them with bits 4-5 of the
appropriate nibble from the first $56,
and stores them in bytes $AC..$FF of
the target page in memory.

008A-   29 FC       AND   #$FC
008C-   A2 AC       LDX   #$AC
008E-   AC D1 C0    LDY   $C0D1     o_O
0091-   10 FB       BPL   $008E
0093-   59 6A 02    EOR   $026A,Y
0096-   BC BE 02    LDY   $02BE,X
0099-   59 00 02    EOR   $0200,Y

; This address was set at $003C
; based on the target page.
009C-   9D 00 D1    STA   $D100,X   /!\
009F-   E8          INX
00A0-   D0 EC       BNE   $008E

Here endeth nibble loop #4.

; Finally, get the last nibble,
; which is the checksum of all
; the previous nibbles.
00A2-   29 FC       AND   #$FC
00A4-   AC D1 C0    LDY   $C0D1     o_O
00A7-   10 FB       BPL   $00A4
00A9-   59 6A 02    EOR   $026A,Y

; if checksum fails, start over
00AC-   D0 96       BNE   $0044

; This was set to the physical
; sector number (at $0037), so
; this is an index into the 16-
; byte array at $00DE.
00AE-   A0 00       LDY   #$00      /!\

; store $00 at this index in the sector
; array to indicate that we've read
; this sector
00B0-   96 DE       STX   $DE,Y

; are we done yet?
00B2-   E6 00       INC   $00

; nope, loop back to read more sectors
00B4-   D0 8E       BNE   $0044

; And that's all she read.
00B6-   60          RTS

0boot's track read routine is done when
$0000 hits $00, which is astonishingly
beautiful. Like, "now I know God" level
of beauty.
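
To see why the four loops work, here is
a Python model of the whole data field
decode. It is a sketch with some
liberties: it works on already-decoded
6-bit values instead of raw nibbles (so
the pre-shift lookup is just "<< 2"),
and encode() is my own reference model
of the standard XOR-chained layout, not
anything from 0boot. All names are
mine.

```python
def swap2(v):
    return ((v & 1) << 1) | ((v >> 1) & 1)

# TWO_BIT[4*v + col] mirrors the table at $0200:
# col 0 = bits 4-5, col 1 = bits 2-3, col 2 = bits 0-1
TWO_BIT = [0] * 256
for v in range(64):
    for col in range(3):
        TWO_BIT[4 * v + col] = swap2((v >> (4 - 2 * col)) & 3)

def encode(page):
    """Reference model: 256 bytes -> 343 chained values."""
    aux = []
    for i in range(0x56):
        v = swap2(page[i] & 3)
        v |= swap2(page[i + 0x56] & 3) << 2
        if i + 0xAC < 0x100:
            v |= swap2(page[i + 0xAC] & 3) << 4
        aux.append(v)
    values = aux + [b >> 2 for b in page]
    out, prev = [], 0
    for v in values:
        out.append(v ^ prev)       # each value XOR the last
        prev = v
    return out + [prev]            # trailing checksum

def decode(field):
    """Decode in one pass, the way loops #1-#4 do."""
    a, pos, buf = 0, 0, []
    for _ in range(0x56):          # loop #1
        pre = field[pos] << 2      # pre-shift lookup
        buf.append(pre)            # stored un-XORed
        a ^= pre                   # rolling checksum
        pos += 1
    page = [0] * 0x100
    for start, count, col in ((0x00, 0x56, 2),   # loop #2
                              (0x56, 0x56, 1),   # loop #3
                              (0xAC, 0x54, 0)):  # loop #4
        a &= 0xFC                  # restart the 2-bit chain
        for j in range(count):
            a ^= field[pos] << 2         # top 6 bits
            a ^= TWO_BIT[buf[j] + col]   # bottom 2 bits
            page[start + j] = a
            pos += 1
    a = (a & 0xFC) ^ (field[pos] << 2)   # checksum nibble
    if a:
        raise ValueError("bad checksum")
    return bytes(page)
```

The part that surprised me: the buffer
holds the un-XORed values, so the low
two bits of the accumulator act as a
second rolling XOR. That is why each
later loop starts with "AND #$FC".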

And so it goes: we pop another address
off the stack, move the drive arm, read
another track. Eventually we get to the
$0003 address we pushed to the stack in
boot0. That "returns" to $0004, which
looks like this:

; game-specific number of sectors to
; load from final track (subtracted
; from $100, so 5 sectors -- they'll go
; into $BB00..$BFFF in memory)
0004-   A2 FB       LDX   #$FB
0006-   86 00       STX   $00

That was hidden in plain sight this
entire time, inside the BIT instruction
at $0003. I told you we'd return to it.
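
If you want to see the trick in slow
motion, here is a toy Python
disassembler that knows just the three
opcodes involved. The bytes come from
the listing; the names are mine, and
this is a sketch, not a real 6502
disassembler.

```python
# opcode -> (mnemonic template, instruction length)
OPCODES = {
    0xA2: ("LDX #${:02X}", 2),
    0x2C: ("BIT ${:04X}", 3),
    0x86: ("STX ${:02X}", 2),
}

# zero page bytes $0001..$0007, from the listing
CODE = {0x01: 0xA2, 0x02: 0xF0, 0x03: 0x2C, 0x04: 0xA2,
        0x05: 0xFB, 0x06: 0x86, 0x07: 0x00}

def disassemble(pc, end=0x08):
    """Decode instructions from pc up to (not including) end."""
    out = []
    while pc < end:
        template, size = OPCODES[CODE[pc]]
        operand = CODE[pc + 1]
        if size == 3:
            operand |= CODE[pc + 2] << 8   # little-endian
        out.append(template.format(operand))
        pc += size
    return out
```

Entering at $0001 decodes as LDX #$F0,
BIT $FBA2, STX $00. Entering at $0004
decodes the same bytes as LDX #$FB,
STX $00.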

After reading 5 sectors from the final
track, we hit the "RTS" at $00B6 again,
burn through the machine initialization
routines we pushed to the stack (IN#0,
PR#0), then pop off one last address
and continue at $00D8:

; turn off drive motor
00D8-   AD D1 C0    LDA   $C0D1     /!\

; jump to game-specific loader
00DB-   4C 00 0B    JMP   $0B00

And that's all she wrote^H^H^H^H^Hread.

Quod erat liberandum.

                   ~

               Changelog

2020-06-24

- typo in the 6-and-2 encoding diagram
  [thanks Andrew R.]

2016-10-16

- typos (thanks qkumba)

2016-10-10

- initial release

---------------------------------------
A 4am crack                     No. 872
------------------EOF------------------
